CBT Campus' Online Skills Training Courses.

Big Data Analytics: Spark for High-speed Big Data Analytics

Course Number:
it_dlbdadj_02_enus

Expected Duration (hours)
0.9

Lesson Objectives

discover the key concepts covered in this course
recognize how Spark offers an open-source, scalable, massively parallel, in-memory solution for analytics applications
outline the two main components of the Spark architecture: Resilient Distributed Dataset and Directed Acyclic Graph
describe how Spark is providing business value to Uber
describe how Spark is providing business value to Alibaba
describe how Spark is providing business value to the Healthcare industry
compare Spark and Hadoop and name their main differences with respect to ease of use, latency, security, and cost
specify in which scenarios and conditions Spark is a better choice than its alternatives
list the main features of Spark, such as loading behavior, file formats, parallelism, caching, and data skew
name the most important performance optimization techniques in Apache Spark, such as file format selection, level of parallelism, and API selection
name simple best practices when using Spark, like starting small or resolving skewness
summarize the key concepts covered in this course

Overview/Description
Spark is an open-source, massively parallel, in-memory solution that allows you to run big data analytics pipelines at high speed. Use this course to learn how Apache Spark works and gain an understanding of its architecture. As you progress, investigate the industry-leading examples of Uber and Alibaba to recognize how Spark can add business value to data across many industries. Moving along, compare the functionality of Spark and Hadoop in relation to use cases, identifying when using Spark is most advantageous. Finally, explore fundamental Spark characteristics, optimization techniques, and best practices. When you've completed this course, you'll have a solid theoretical understanding of how and when to use Apache Spark for specific big data analytics tasks.

Target

Prerequisites: none

Big Data Analytics: Techniques for Big Data Analytics

Course Number:
it_dlbdadj_01_enus

Expected Duration (hours)
0.6

Lesson Objectives

discover the key concepts covered in this course
describe the challenges in the current data analytics models and system designs, such as scalability, consistency, reliability, efficiency, and maintainability
name and describe the role of the main layers of big data analytics, from the bottom all the way to the top
specify why unstructured data comes from a variety of sources and describe how it moves from its origin to storage, where it is further analyzed and visualized
define the role of the data processing layer and specify how information captured in the previous layer is processed
define the role of the data storage layer using HDFS as an example of commonly used primary data storage
outline the main pillars and components of big data architecture
describe batch processing, its use cases, and common reasons for using it
outline how stream processing enables quick decision-making by creating actionable real-time insights
define the concept of Lambda architecture and outline its use cases
define the concept of Kappa architecture and outline its use cases
summarize the key concepts covered in this course

Overview/Description
Big data analytics provides a way to turn the vast amounts of data available in today's digital world into valuable insights. For this reason, big data analytics techniques have taken a central place in many businesses' IT infrastructure. These techniques comprise complex processes and multiple stack layers that allow you to transform raw data into visualizations that demonstrate trends or other phenomena. Use this course to explore the basic principles and techniques of big data analytics in a business context. Go through each step of data processing to fully comprehend the big data analytics pipeline. Furthermore, explore various use cases of big data analytics through real-world examples. When you're done with this course, you'll have a foundational understanding of some of the technologies behind big data and how these can drive better business decisions.

Target

Prerequisites: none